Abstract

We estimate the probability that random noise, of several plausible standard distributions, 
creates a false alarm that a periodicity (or log-periodicity) is found in a time series. The solution 
of this problem is already known for independent Gaussian distributed noise. We investigate
more general situations with non-Gaussian correlated noises and present synthetic tests on the 
detectability and statistical significance of periodic components. A periodic component of a 
time series is usually detected by some sort of Fourier analysis. Here, we use the Lomb peri- 
odogram analysis which is suitable and outperforms Fourier transforms for unevenly sampled 
time series. We examine the false-alarm probability of the largest spectral peak of the Lomb 
periodogram in the presence of power-law distributed noises, of short-range and of long-range 
fractional-Gaussian noises. Increasing heavy-tailness (respectively correlations describing per- 
sistence) tends to decrease (respectively increase) the false-alarm probability of finding a large 
spurious Lomb peak. Increasing anti-persistence tends to decrease the false-alarm probability. 
We also study the interplay between heavy-tailness and long-range correlations. In order to 
fully determine if a Lomb peak signals a genuine rather than a spurious periodicity, one should 
in principle characterize the Lomb peak height, its width and its relations to other peaks in 
the complete spectrum. As a step towards this full characterization, we construct the joint- 
distribution of the frequency position (relative to other peaks) and of the height of the highest 
peak of the power spectrum. We also provide the distributions of the ratio of the second highest 
Lomb peak to the maximum peak. Using the insight obtained by the present statistical study, we 
re-examine previously reported claims of "log-periodicity" and find that the credibility for log- 
periodicity in 2D-freely decaying turbulence is weakened while it is strengthened for fracture, 
for the ion-signature prior to the Kobe earthquake and for financial markets. 

1 Introduction 



The problem of detecting oscillatory components in noisy data is one of the most general problems 
found in all scientific disciplines. Without being exhaustive, the search of periodic signals in com- 
plex time series goes from astrophysics (for instance in the detection of planets outside our solar 
system), geosciences (for instance in the detection of cycles in meteorology and climate dynamics), 
medicine (for instance in the effect of circadian cycles), to economics (for instance in the business 
cycles or in the calendar effects occurring at intra-day, week, month and year time scales in financial 
time series). It is not stretching reality to state that almost all fields have at least one outstanding
question related to the detection of tiny periodic oscillations in an otherwise noisy and complex
background. Many techniques have been invented, adapted or improved to address the specificities 
of each situation. 

The most standard tool for detecting a periodic component in a noisy time series is the Fourier 
transform (FT) and, when possible, the fast Fourier transform (FFT). 

The FT and FFT methods require evenly sampled signals. However, due to measurement con- 
straints, it can happen that the time series is not evenly spaced in time. A straightforward approach 
is to reconstruct evenly spaced data using interpolation or rebinning and then apply the FFT method. 
However, interpolation and rebinning both introduce some distortion and may lose some informa- 
tion in situations where the periodic signal is weak and difficult to distinguish from noise. 

In addition, the FT and FFT methods perform well only when the data size is relatively large
so as to minimize aliasing (window border) distortions at the extremities of the time series. Wavelet 
techniques and singular spectral analysis have been developed to address in part this problem. 

A completely different method for detecting periodic components in an unevenly sampled and
relatively short signal is the Lomb periodogram. The Lomb periodogram analysis performs a
local least-squares fit of the data by sinusoids centered on each data point of the time series [?].
Given an unevenly spaced signal $y(t_i)$, where $i = 1, 2, \cdots, N$, the normalized Lomb periodogram
$P_N(\omega)$ is calculated according to the following formula [?]:



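$$P_N(\omega) = \frac{1}{2\sigma^2} \left\{ \frac{\left[ \sum_i y(t_i) \cos\omega(t_i - \tau) \right]^2}{\sum_i \cos^2\omega(t_i - \tau)} + \frac{\left[ \sum_i y(t_i) \sin\omega(t_i - \tau) \right]^2}{\sum_i \sin^2\omega(t_i - \tau)} \right\} , \tag{1}$$

where $\tau$ is defined by the equation

$$\tan(2\omega\tau) = \sum_i \sin 2\omega t_i \Big/ \sum_i \cos 2\omega t_i , \tag{2}$$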



and $\sigma^2$ is the total variance of the data (including noise and signal)^1. The normalized Lomb
periodogram $P_N(\omega)$ is similar to a FT power spectrum, in which the presence of peaks at certain
angular frequencies $\omega_n$ signals the possible existence of periodic components. The largest peaks
point to the most important frequencies in the time series: the higher a peak, the larger the statistical
significance of the corresponding periodic component. Due to its local least-squares fit nature, the
Lomb periodogram analysis works equally well for unevenly spaced time series. It is also much
less prone to aliasing (window border) distortions in short time series.
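As an illustration of the direct evaluation of the normalized Lomb periodogram, the sketch below computes it at a list of angular frequencies in plain Python. It is not the fast FFT-based method. The function name `lomb_periodogram`, the subtraction of the sample mean before projecting onto the sinusoids, and the use of the unbiased sample variance as the total variance are assumptions made for this example.

```python
import math
import random

def lomb_periodogram(t, y, omegas):
    """Normalized Lomb periodogram, evaluated directly (O(N) work per frequency)."""
    n = len(t)
    mean = sum(y) / n
    # Total variance of the data (signal plus noise), used for the normalization.
    var = sum((v - mean) ** 2 for v in y) / (n - 1)
    d = [v - mean for v in y]  # assumption: the sample mean is removed first
    power = []
    for w in omegas:
        # Time offset tau: tan(2 w tau) = sum sin(2 w t_i) / sum cos(2 w t_i).
        s2 = sum(math.sin(2 * w * ti) for ti in t)
        c2 = sum(math.cos(2 * w * ti) for ti in t)
        tau = math.atan2(s2, c2) / (2 * w)
        cs = [math.cos(w * (ti - tau)) for ti in t]
        sn = [math.sin(w * (ti - tau)) for ti in t]
        yc = sum(di * c for di, c in zip(d, cs))
        ys = sum(di * s for di, s in zip(d, sn))
        cc = sum(c * c for c in cs)
        ss = sum(s * s for s in sn)
        power.append((yc * yc / cc + ys * ys / ss) / (2 * var))
    return power

# Usage sketch: a cosine of angular frequency 2.0 plus weak Gaussian noise,
# sampled at 100 uneven times built from cumulative uniform increments.
rng = random.Random(0)
times, acc = [], 0.0
for _ in range(100):
    acc += rng.random()
    times.append(acc)
signal = [math.cos(2.0 * ti) + 0.1 * rng.gauss(0.0, 1.0) for ti in times]
omegas = [0.1 * k for k in range(1, 60)]
power = lomb_periodogram(times, signal, omegas)
peak_omega = omegas[power.index(max(power))]
```

In this toy example the highest value of `power` falls at the grid frequency closest to the injected one. The frequency grid and the noise level are arbitrary choices for the illustration.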

To understand intuitively how expression (1) allows one to detect periodic components, let
us consider a noisy signal $y(t)$ made of a finite number of independent periodic functions plus a
noise $x(t)$ of zero mean which is independent of the periodic components:

$$y(t) = \sum_{m=1}^{M} y_m \cos(\omega_m t) + x(t) . \tag{3}$$

Using the independence between the noise and the periodic signals, we obtain $\langle y \rangle = 0$ and
$\sigma^2 = \langle y^2 \rangle = \sum_{m=1}^{M} y_m^2/2 + \sigma_x^2$. Since the translation $\tau$ in Eq. (1) usually provides a very minor
correction, we can ignore it here. In addition, we assume that the $t_i$'s are not badly bunched [?]. Given
these two conditions, the Lomb periodogram at angular frequency $\omega_m$ can be written as

$$P_N(\omega_m) = \frac{N y_m^2}{4} \left( \sum_{m'=1}^{M} \frac{y_{m'}^2}{2} + \sigma_x^2 \right)^{-1} . \tag{4}$$



^1 The formula uses the total variance of the data, which is generally available, rather than the variance of the noise
derived either from the residuals after subtracting a signal or from the uncertainty of the measurements [?].
This result (4) emphasizes the importance of normalizing the periodogram by the total variance of
the data. It also shows that the Lomb peak height is approximately proportional to the number $N$ of
points and to the square of the amplitude of a given periodic component. Note that Eq. (4)
holds approximately for all types of noises, not only for Gaussian noise. Numerical verifications
for the case of a single noisy sine signal, i.e. $M = 1$, have been presented in Ref. [?].
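The proportionality to $N$ and to the squared amplitude can be retraced term by term. The following is a sketch under the two conditions stated above (negligible $\tau$, sampling times not badly bunched), in which sums over the sampling times are replaced by $N$ times their averages and the noise contributions, of order $\sqrt{N}$, are neglected. These simplifications are assumptions of the sketch, not a derivation taken from the cited references.

```latex
\begin{align*}
\sum_i y(t_i) \cos(\omega_m t_i) &\approx y_m \sum_i \cos^2(\omega_m t_i) \approx \frac{N y_m}{2} , \\
\sum_i y(t_i) \sin(\omega_m t_i) &\approx 0 , \\
\sum_i \cos^2(\omega_m t_i) &\approx \sum_i \sin^2(\omega_m t_i) \approx \frac{N}{2} , \\
P_N(\omega_m) &\approx \frac{1}{2\sigma^2} \, \frac{(N y_m / 2)^2}{N/2} = \frac{N y_m^2}{4 \sigma^2} ,
\qquad \sigma^2 = \sum_{m'=1}^{M} \frac{y_{m'}^2}{2} + \sigma_x^2 .
\end{align*}
```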

A severe drawback when using Eq. (1) directly is computational: the time needed to calculate
a Lomb periodogram typically grows as $N^2$, compared to $N \ln N$ for the FFT. To fix this, a fast
computation approach of the Lomb periodogram has been proposed [?]. The method uses the
FFT, but it is not an FFT periodogram of the data. In this method, $P_N(\omega)$ can be calculated with
any desired precision and the time needed is only of order $N \ln N$. This is the method used in the
present work.

However, the presence of a peak in the Lomb periodogram does not guarantee the presence of a 
genuine periodic signal as random fluctuations can create peaks by chance alone. What is needed is 
a quantitative assessment of the probability that peaks can be generated by random fluctuations. The 
statistical significance of a given peak is the probability that it does not stem from noise. To this end,
previous works [?] have defined a null hypothesis, which assumes that a given peak observed in the
Lomb periodogram stems from independent Gaussian noise. Here, we shall develop a hierarchy of
more sophisticated null hypotheses.

If the signal $y(t_i)$ is pure noise, with independent normally distributed values, then,
at any particular $\omega$, it can be shown that the normalized periodogram $P_N(\omega)$ has an exponential
probability distribution with unit mean [?]. In other words, the probability that the height
$z = P_N(\omega)$ of the Lomb periodogram at any given frequency is found between some positive $z$ and
$z + dz$ is $e^{-z} dz$ [?], independently of the amplitude of the Gaussian noise. This independence with
respect to the noise amplitude stems from the normalization made explicit in equations (1) and (2). If we scan
some $M$ independent frequencies $\omega_1, \omega_2, \cdots, \omega_M$, the probability that none of these $M$ frequencies
gives a value of the Lomb periodogram $P_N(\omega_1), P_N(\omega_2), \cdots, P_N(\omega_M)$ larger than $z$
is $(1 - e^{-z})^M$. Therefore, the probability for any Lomb peak to exceed $z$ is

$$\Pr(>z) = 1 - (1 - e^{-z})^M , \tag{5}$$

which defines the false-alarm probability of the null hypothesis. The significance level of any peak
$P_N(\omega)$ is thus $1 - \Pr(>z)$. A small value for the false-alarm probability indicates a highly
significant periodic signal. The interesting region is where the false-alarm probability is small, such
that equation (5) can be expanded to give

$$\Pr(>z) \approx M e^{-z} . \tag{6}$$
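The two expressions for the false-alarm probability under the Gaussian white-noise null hypothesis are easy to compare numerically. The short sketch below does so; the function names and the use of `expm1`/`log1p` to keep precision for small probabilities are choices made for this example.

```python
import math

def false_alarm_probability(z, m):
    """1 - (1 - exp(-z))**m: chance that the highest of m independent peaks exceeds z > 0."""
    # log1p and expm1 avoid the loss of precision of 1 - (1 - tiny)**m.
    return -math.expm1(m * math.log1p(-math.exp(-z)))

def false_alarm_approx(z, m):
    """Small-probability expansion m * exp(-z)."""
    return m * math.exp(-z)

# Usage sketch: peak height z = 10 with m = 100 independent frequencies.
exact = false_alarm_probability(10.0, 100)
approx = false_alarm_approx(10.0, 100)
relative_gap = abs(exact - approx) / approx
```

For these values both expressions give a false-alarm probability of a few times $10^{-3}$, and the expansion overestimates the exact value by well under one percent.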

In order to estimate the false-alarm probability, extensive Monte Carlo simulations have been carried
out to determine the values of $M$ in various cases [?]. In those simulations [?], all data sets
were Gaussian noise with different spacings in time: (1) evenly spaced data in time; (2) time
increments drawn as uniformly distributed random numbers in (0,1); and (3) increments clumped in
groups. These works [?] thus aimed at testing the impact of different non-uniform sampling rates,
rather than the effect of non-Gaussian non-white noise as studied here.

Our interest in this problem is motivated by the problem of detecting so-called log-periodic
oscillations in self-similar systems, which are the signatures of the symmetry of discrete scale
invariance (DSI). The hallmark of self-similarity is the presence of a power law followed by an
observable $F(x)$ as a function of the scale $x$ of observation or of the distance $x$ to the critical point.
The hallmark of DSI is the existence of oscillations that are periodic in the logarithm of $x$, hence
the name "log-periodic". As a complement to fitting an observable with a power law decorated by
log-periodic oscillations, a non-parametric detection of log-periodicity is of great interest to test for
the presence of DSI and was introduced in [6, ?, ?] based on the Lomb periodogram.

Such a method indeed requires the use of the Lomb periodogram analysis because, by construction,
the sampling is non-uniform since the analysis must be performed in the variable $\ln x$.
In addition, only a few log-periodic oscillations are usually available, since each fully completed
oscillation corresponds to a multiplication of the observable by a preferred scaling ratio $\lambda > 1$.
Hence, $n$ oscillations correspond to spanning the interval $[x_0, x_0 \lambda^n]$, where $x_0$ is some constant.
Thus many decades of data are required to observe only a few oscillations. In order to compute
the Lomb periodogram of an observable exhibiting scale invariance, one can either subtract a pure
power law fit from the data and analyze the residue with the Lomb periodogram method in terms
of the variable $\ln x$ [?, 10] or define a local exponent by constructing a local logarithmic derivative
$d \ln F / d \ln x$. Then, the potential oscillatory structure of the fluctuations of this local exponent as a
function of $\ln x$ can be analyzed [11].

The quantification of the false-alarm probability given by (5) and (6) is based on the assumption
that the noise is Gaussian and uncorrelated, as mentioned above. In practice, this assumption is almost
always violated. The present work investigates how the false-alarm probability $\Pr(>z)$ (and therefore the
confidence level $1 - \Pr(>z)$) changes from the predictions (5) and (6) in the presence of
non-Gaussian noise and of dependence between successive innovations (increments) of the time series.
If one knows some of the characteristics of the noise distribution and of its dependence structure, 
a more refined null hypothesis asks the following question: what is the false-alarm probability of 
the highest peak of a Lomb periodogram performed on a pure noise with the same distribution and 
same dependence structure? 

Two recent papers [?, 10] have already partially addressed this question. Ref. [?] identified a
novel source for periodicity relying solely on the manipulation of data: (1) approximate uniform 
sampling and (2) use of cumulative or integrated quantities which reddens the noise (increases the 
power spectrum at low frequencies) and, in a finite sample, creates a maximum in the spectrum 
leading to a most probable frequency. It was found that (i) the frequencies of the oscillations have a 
power spectrum given approximately by (ii) the most probable frequency scales as the inverse 
of the interval size; (iii) the amplitude of the periodic oscillations scales as the inverse square root 
of the number of data points (central limit theorem). These properties were studied in the context 
of the search for log-periodicity in earthquake aftershock time series, where the natural variable is 
$\ln t$. Ref. [10] presented a battery of synthetic tests performed to quantify the statistical significance
of the existence of log-periodic oscillations in cumulative seismic data up to major earthquakes. 
By defining synthetic tests that are as much as possible identical to the real time series except for 
the property to be tested, namely log-periodicity, Ref. [10] found that log-periodic oscillations with 
frequency and regularity similar to those of the real seismic data are likely to be generated by the 
interplay of the low pass filtering step due to the construction of cumulative functions together with 
an approximate power law acceleration. 

As a prerequisite of the analysis of log-periodicity in many systems (growth processes, rupture, 
earthquakes, turbulence, finance, etc), it is necessary to extend the previous investigation on the 
Lomb analysis to non-Gaussian noise with the possible addition of correlated noise. In the sequel, 
we examine the following three cases: 1) noises with heavy tails but no correlations, 2) Gaussian 
noises with correlations, and 3) noises with both non-Gaussian heavy tails and correlations. We 
perform extensive numerical simulations to determine the effects of either heavy-tailness or corre- 
lations on the significance level of the peaks. 

Our presentation is organized as follows. We investigate the effect of heavy-tailness in noises 
on the significance level of the signals in Sec. 2. We analyze two types of noises, the first class
characterized by Levy stable distributions and the second one distributed according to general power
law distributions. In Sec. 3, we consider fractional Gaussian noises with long-range correlations (both
persistent and anti-persistent). The interplay between heavy-tailness and correlations in noises is
studied in Sec. 4. Together with the heavy-tailed noise distributions, we consider both short-range
exponentially decaying correlations as well as very long-range correlations such as in fractional
Gaussian motions. We consider the joint distribution of the frequency and height of the highest
peak of the Lomb power spectrum in Sec. 5. Applications of the numerical results to real systems
are discussed in Sec. 6. We summarize our results in Sec. 7.



2 Effect of heavy-tailness 
2.1 Levy stable noise 



We consider an uncorrelated Levy noise [12], in which each innovation (increment) is independently
drawn from the same Levy distribution. There is no closed analytical expression for the probability
function of the stable distributions, with a few exceptions, e.g., the Cauchy distribution ($\alpha = 1$) and
the Gaussian distribution ($\alpha = 2$) [13]. Stable distributions are given in terms of their characteristic
function [14]. We denote $X \sim S^{\alpha}(\lambda, \mu, \sigma)$ or $S^{\alpha}(x; \lambda, \mu, \sigma)$ such a Levy stable distribution. The
symbol $\sim$ means that the random variable $X$ is drawn from the process $S^{\alpha}(\lambda, \mu, \sigma)$. This process
is specified by four parameters: $\alpha$ is the stable characteristic exponent ($0 < \alpha \le 2$), $\lambda$ is the
symmetry parameter ($-1 \le \lambda \le 1$), $\sigma$ is the scale factor ($\sigma > 0$), and $\mu$ is the location parameter.
When $\alpha = 2$, the distribution is Gaussian with mean $\mu$ and variance $2\sigma^2$, and $\lambda$ has no effect. The
tail probability distribution in the non-Gaussian cases $\alpha < 2$ is known asymptotically as [14, 15]:

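$$\Pr(S^{\alpha} > x) \sim \sigma^{\alpha} \, \frac{1+\lambda}{2} \, C_{\alpha} \, x^{-\alpha} , \tag{7}$$

where $C_{\alpha} = \frac{1-\alpha}{\Gamma(2-\alpha) \cos(\pi\alpha/2)}$, if $\alpha \neq 1$. Hence, stable distributions have heavy tails: the lower the
value of $\alpha$, the more extended the tails become and the greater is the probability of extreme events.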



In the case of $\lambda = 0$, we have the expression of symmetric Levy stable distributions [16, 17]:

$$f(x) = \frac{1}{\pi} \int_0^{\infty} \exp(-\sigma q^{\alpha}) \cos(qx) \, dq . \tag{8}$$



Our generation algorithm is based on the so-called CMS method [18], which is applied in such
a way that $x$ will have the characteristic function $\phi(\theta)$ of the Levy stable distribution [19]. With
this parameterization, the stable distribution $S^{\alpha}(x; \lambda, \mu, \sigma)$ equals $S^{\alpha}((x-\mu)/\sigma; \lambda, 0, 1)$ [?]. We used
the program provided by McCulloch^2. In our simulations, we fixed $\lambda = 0$, $\mu = 0$ and $\sigma = 1$ and
changed $\alpha$ from 0.1 up to 2 with step 0.1 so that we can compare our results with those of Gaussian
white noise. Four typical noises with 10,000 data points are illustrated in Fig. 1: (a) $\alpha = 2$, (b)
$\alpha = 1.5$, (c) $\alpha = 1$, and (d) $\alpha = 0.5$. For each $\alpha$, we generated 50,000 data sets each with 100 data
points. The unevenly spaced sampling of times which is adopted throughout this paper is generated
as follows: one first generates 100 uniformly distributed random numbers between 0 and 1, and
then obtains the cumulative sums that are used as the sampling times $t_1, \cdots, t_{100}$. By construction,
the sampling intervals $\Delta t_i = t_{i+1} - t_i$ are uniformly distributed. We performed the Lomb analysis
on each data set.
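The construction of the sampling times and of symmetric stable innovations can be sketched as follows. This is not McCulloch's program: it is a minimal stand-in that uses the textbook Chambers-Mallows-Stuck formula restricted to the symmetric case with zero location and unit scale. The function names are hypothetical. With this convention the case $\alpha = 2$ yields a Gaussian of variance 2 and $\alpha = 1$ yields a standard Cauchy variable.

```python
import math
import random

def uneven_sampling_times(n, rng):
    """Cumulative sums of n uniform(0, 1) increments, used as sampling times."""
    times, acc = [], 0.0
    for _ in range(n):
        acc += rng.random()
        times.append(acc)
    return times

def symmetric_stable(alpha, rng):
    """One symmetric stable draw (zero skewness, zero location, unit scale), 0 < alpha <= 2."""
    v = (rng.random() - 0.5) * math.pi   # uniform angle in [-pi/2, pi/2)
    w = rng.expovariate(1.0)             # exponential variable with unit mean
    if alpha == 1.0:
        return math.tan(v)               # Cauchy case
    return (math.sin(alpha * v) / math.cos(v) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

# Usage sketch: 100 uneven sampling times and two sets of 20,000 innovations.
rng = random.Random(1)
times = uneven_sampling_times(100, rng)
gaussian_case = [symmetric_stable(2.0, rng) for _ in range(20000)]
cauchy_case = [symmetric_stable(1.0, rng) for _ in range(20000)]
```

A data set in the sense of the text would pair `times` with 100 such innovations for a given exponent.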



^2 Available at http://economics.sbs.ohio-state.edu/jhm/jhm.htm



Fig. 2 shows the four Lomb periodograms corresponding to the four time series shown in Fig. 1.
It suggests visually that the spontaneous emergence of peaks above the background becomes
less and less probable for smaller $\alpha$. This is confirmed by extensive statistical tests summarized in
Fig. 3, which plots the false-alarm probability as a function of the Lomb peak for different exponents
$0.1 \le \alpha \le 2$. For reference and comparison, the Gaussian case is represented by the open circles.
Within statistical fluctuations, it is identical to the last curve to the right corresponding to $\alpha = 2$,
as it should be. Remarkably, for a given Lomb peak height, the false-alarm probability increases with
$\alpha$. Hence, a Levy noise with heavier tail (smaller exponent $\alpha$) results in a higher significance for a
given Lomb peak height. In other words, it is more difficult to generate a large peak in the Lomb
periodogram analysis when the noise has a more extended tail. Intuitively, the larger amplitudes of the
noise fluctuations increase the effective randomness of the noise and destroy possible oscillations
constructed by random occurrences. Put differently, Fig. 1 suggests that the smaller $\alpha$ is, the
smaller is the number of points dominating the overall signal: for instance, the last panel for $\alpha = 0.5$
shows that only a few very large peaks dwarf the rest of the population of noise innovations
(increments). Since there is very little structure, there is little chance for a periodic component to
appear, hence the small false-alarm probability.

The semi-logarithmic representation of Fig. 3 suggests that the false-alarm probability is
approximately an exponentially decaying function of the Lomb peak, similarly to the Gaussian case
described by Eq. (6). This suggests fitting the dependence of the false-alarm probability for various
$\alpha$ as an exponential of the Lomb peak height

$$\Pr_{\alpha}(z) = M(\alpha) \, e^{-k(\alpha) z} , \tag{9}$$

with an effective number $M(\alpha)$ of independent frequencies and a decay rate $k(\alpha)$, which are
expected to be respectively increasing and decreasing functions of $\alpha$.

In order to obtain more reliable estimations of $M(\alpha)$ and $k(\alpha)$, we excluded the largest $z$'s
from the fit with Eq. (9) because they are spoiled by large fluctuations. The scaling range was
fixed in the interval $\Pr_{\alpha}(z) \in [0.001, 0.05]$. The upper bound 0.05 ensures the validity of the
approximation (6). The lower bound 0.001 is a good compromise between having a large range for
the fit and excluding the largest fluctuations occurring for the smallest $\Pr_{\alpha}(z)$. The error bars
for $k(\alpha)$ are estimated by evaluating the standard deviation over 12 subintervals: the
ten subintervals resulting from subdividing the range [0.001, 0.05] equally in ten parts, plus
two adjoining subintervals of the same size, one below the lower bound and one above the upper
bound. The results are listed in Table 1. We find $M(2) = 104$ and $k(2) = 1.14 \pm 0.19$, which are
consistent with the results $M = 119$ and $k = 1$ known for independent Gaussian noise [?]. Table
1 shows a systematic decrease of $k(\alpha)$ as $\alpha$ increases, which captures the observation that the
false-alarm probability increases with $\alpha$. We find however that the determination of the effective
number $M(\alpha)$ of independent frequencies is not reliable, as it is very sensitive to the
value of $k(\alpha)$. Therefore, we cannot draw conclusions about a specific dependence of $M(\alpha)$ on
$\alpha$. This limitation is relatively minor as the pre-exponential factor $M(\alpha)$ has a weak impact on the
false-alarm probability (a factor of two or three at most).
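The fitting step can be illustrated with an ordinary least-squares line in semi-logarithmic coordinates, restricted to the range of probabilities quoted above. This is only a sketch: the function name is hypothetical, the procedure used for the error bars is not reproduced, and the input here is an exact exponential curve rather than simulated false-alarm probabilities.

```python
import math

def fit_exponential_tail(z_values, probabilities, lower=0.001, upper=0.05):
    """Least-squares fit of ln Pr = ln M - k z over the points with lower <= Pr <= upper."""
    pts = [(z, math.log(p)) for z, p in zip(z_values, probabilities)
           if lower <= p <= upper]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two points inside the fitting range")
    mean_z = sum(z for z, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    slope = (sum((z - mean_z) * (l - mean_l) for z, l in pts)
             / sum((z - mean_z) ** 2 for z, _ in pts))
    k = -slope
    m = math.exp(mean_l + k * mean_z)  # intercept of the fitted line gives ln M
    return m, k

# Usage sketch: recover the parameters of a synthetic curve 104 * exp(-1.14 z).
zs = [5.0 + 0.1 * i for i in range(101)]
probs = [104.0 * math.exp(-1.14 * z) for z in zs]
m_fit, k_fit = fit_exponential_tail(zs, probs)
```

Because $\ln M$ is obtained by extrapolating the fitted line back to $z = 0$, small changes in the slope translate into large changes in $M$, which is the sensitivity noted in the text.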

2.2 Symmetrical power-law noises 

We now consider noises distributed with density distribution functions decaying as pure power laws
according to

$$f(y) = \begin{cases} \frac{\kappa}{2} \, |y|^{-(1+\kappa)} , & |y| \ge 1 \\ 0 , & |y| < 1 . \end{cases} \tag{10}$$
 α      k(α)            M(α)
 0.1    3.06 ± 0.72      32
 0.2    2.89 ± 0.74      90
 0.3    2.69 ± 0.40     154
 0.4    2.42 ± 0.41     151
 0.5    2.12 ± 0.25      90
 0.6    1.98 ± 0.27      97
 0.7    2.04 ± 0.29     224
 0.8    1.89 ± 0.36     185
 0.9    1.73 ± 0.24     133
 1.0    1.59 ± 0.18      92
 1.1    1.59 ± 0.24     150
 1.2    1.50 ± 0.33     121
 1.3    1.49 ± 0.15     166
 1.4    1.39 ± 0.17     131
 1.5    1.33 ± 0.23     112
 1.6    1.27 ± 0.11     101
 1.7    1.19 ± 0.15      78
 1.8    1.16 ± 0.12      85
 1.9    1.12 ± 0.20      80
 2.0    1.14 ± 0.19     104

Table 1: List of the parameters k(α) and M(α) of the fit of the false-alarm probability Pr_α(z) by
expression (9).



These distributions complement the results obtained using the Levy stable distributions of the
previous section by 1) testing for the role of a strong difference in the bulk part of the distributions and
2) allowing for the exponent $\kappa$ to be larger than 2.

We first generate uniformly distributed random numbers $x$ in (0, 1), and then substitute them
into

$$y = \begin{cases} -(2x)^{-1/\kappa} , & 0 < x < 0.5 , \\ [2(1-x)]^{-1/\kappa} , & 0.5 \le x < 1 . \end{cases} \tag{11}$$
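The substitution above is an inverse-transform sampler, and a few lines of Python suffice to sketch it. The function name is hypothetical, and the loop that rejects an exact zero is a guard added for this example, since `random.random()` can return 0.0 while the formula requires a strictly positive argument.

```python
import random

def symmetric_power_law(kappa, rng):
    """Draw y with |y| >= 1 and symmetric power-law tails of exponent kappa."""
    x = rng.random()
    while x == 0.0:  # the left branch needs 0 < x; redraw in that rare case
        x = rng.random()
    if x < 0.5:
        return -(2.0 * x) ** (-1.0 / kappa)       # negative branch
    return (2.0 * (1.0 - x)) ** (-1.0 / kappa)    # positive branch

# Usage sketch: 20,000 draws with tail exponent 1.5.
rng = random.Random(2)
draws = [symmetric_power_law(1.5, rng) for _ in range(20000)]
fraction_beyond_two = sum(1 for d in draws if abs(d) > 2.0) / len(draws)
fraction_positive = sum(1 for d in draws if d > 0.0) / len(draws)
```

For a pure power law of this form the probability that $|y|$ exceeds a threshold $u \ge 1$ is $u^{-\kappa}$, so `fraction_beyond_two` should be close to $2^{-1.5} \approx 0.35$.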

In the simulations, we consider values of $\kappa$ from 0.1 to 2 with spacing 0.1, from 2.5 to 5 with
spacing 0.5, and the values $\kappa = 6$ and $\kappa = 7$. Four typical noises with 10,000 data points are
illustrated in Fig. 4: (a) $\kappa = 5$, (b) $\kappa = 2$, (c) $\kappa = 1.5$, and (d) $\kappa = 0.5$. For each $\kappa$, we
generated 50,000 data sets, each with 100 data points, as in Sec. 2.1. Figure 5 plots the false-alarm
probability as a function of the Lomb peak for different exponents $0.1 \le \kappa \le 7$. For reference
and comparison, the Gaussian case is represented by the open circles. The results are very similar
to those found for the Levy noise shown in Fig. 3. As the exponent $\kappa$ increases, the false-alarm
probability increases: the thinner the tail is, the smaller is the confidence level of a given Lomb
peak. Intuitively, the results are as for the Levy noise: the larger fluctuations for small $\kappa$ introduce a
larger effective randomness, making the occurrence of oscillations less probable. Similarly to the
argument proposed in the previous section, Fig. 4 suggests that the smaller $\kappa$ is, the smaller is the
number of points dominating the overall signal. As a consequence, the time series will have little
structure, hence the small false-alarm probability.

In contrast with the Levy noise, for which the false-alarm curves accumulate at the Gaussian
curve for $\alpha = 2$, the false-alarm curves for the power law noises continue to translate to the right
as $\kappa$ increases. The reason is that $\kappa = 2$ for a power law distribution is not the same as $\alpha = 2$ for
a Levy stable distribution: the former case corresponds to a power law distribution $\sim |y|^{-3}$ while
the latter corresponds to the Gaussian distribution. Interestingly, for $\kappa > 3$, we observe that the
false-alarm probabilities are larger than for the Gaussian case. In this sense, uncorrelated noises
with power law distributions with tail exponent $\kappa > 3$ are "less random" than the Gaussian case as
oscillations caused by random occurrences are more frequent.
The reason why a more extended tail leads on average to smaller Lomb peaks, such that a given
peak height has a larger false-alarm probability (smaller statistical significance) for the Gaussian
noise than for Levy noise, can be easily understood from the following argument. Qualitatively, a
smaller exponent leads to larger fluctuations which introduce a larger effective randomness, making
the occurrence of oscillations less probable. Quantitatively, this "larger randomness" can be
measured by the sample-size dependence of the empirical variance $\sigma^2$ entering the height of the
Lomb peak according to expression (1). Let us present our argument for the power laws (10), for
which the explicit dependence on the exponent is simpler. Here, we use the same notation $\alpha = \kappa$ to
stress that the argument applies in general to distributions with power law tails. Consider the case
$\alpha < 2$ for power laws, for which the variance does not exist from a mathematical point of view.
Notwithstanding this fact, one can always calculate an empirical variance from each specific time
series, as

$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - \langle x_i \rangle \right)^2 , \tag{12}$$



where $x_i$ is the $i$th point of the synthesized Levy stable noise. The notation $\langle x_i \rangle$ refers also to the
empirical estimation of an average from the empirical time series. The fact that the variance (and
the mean for $\alpha < 1$) does not exist is reflected in the fact that the estimated variance grows with
data size $N$ and exhibits large fluctuations, scaling as

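$$\sigma^2 \propto \alpha \int_1^{x_{\max}} dx \, x^{1-\alpha} \propto \frac{\alpha}{2-\alpha} \left( N^{\frac{2}{\alpha} - 1} - 1 \right) , \tag{13}$$

where we have estimated the typical largest value $x_{\max}$ by the standard condition

$$N \alpha \int_{x_{\max}}^{\infty} dx \, x^{-1-\alpha} = 1 , \tag{14}$$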

leading to the estimation $x_{\max} = N^{1/\alpha}$. Expression (14) expresses the fact that the product of the
probability to be of the order of or larger than $x_{\max}$ by the total number of points is of the order 1,
which defines $x_{\max}$. Figure 6 shows the maximal Lomb peak $P_N(\omega)$ averaged over 50,000
realizations of 100 points as a function of $\sigma^2$ given by expression (13). Here $\sigma^2$ is varied by changing
the power law exponent from $\alpha = 0.1$ to 1.9. Increasing $\alpha$ decreases $\sigma^2$ and thus corresponds to
reading figure 6 from right to left. Figure 7 shows directly the dependence of $P_N(\omega)$ as a function
of $0.1 \le \alpha \le 1.9$ in the same range. This rationalizes our empirical finding that the Lomb peak
$P_N(\omega)$, as well as the maximal peak, increases with $\alpha$. This is in line with the results presented
in Figs. 3 and 5. We will observe a similar effect in GARCH(1,1) noise discussed in Sec. 4.1.



3 Effect of correlations 

3.1 Fractional Gaussian noises 

We now study the effect of correlations included in fractional Gaussian noise (fGn) $X_H(t)$ of
parameter $H$. The cumulative sum $B_H(t)$ of fGn defines the fractional Brownian motion (fBm), which
was introduced by Mandelbrot and Van Ness [?] as an extension to the usual Brownian motion.
A process $B_H(t)$ is a fBm if it is a Gaussian random process which has stationary and self-similar
increments,

$$X_H(t) = B_H(t+1) - B_H(t) , \tag{15}$$
which form a fractional Gaussian noise (fGn). The value $H = 1/2$ yields the ordinary Brownian
motion, which is known to be non-persistent with absence of correlations of the increments $X_{1/2}(t)$.
For $0 < H < 1/2$, the fBm is antipersistent, while for $1/2 < H < 1$, the fBm is persistent [21].

There are numerous synthesis methods proposed to generate fBm, and hence fGn. They include
the random midpoint displacement method [21, 22, 23], the Cholesky decomposition method of the
covariance matrix of vector increments using the Durbin-Levinson algorithm [24, 25, 26], the inverse
Fourier transform method based on the power spectrum [27, ?], the fast Fourier transform method
matching the covariance structure instead of the frequency spectrum [29, ?], more recently the wavelet
transform method [?], and others [?].

We adopted the Cholesky-Levinson factorization method^3, which is one of the best methods
among those used to synthesize Gaussian time series [?]. This algorithm for generating fBm
performs a linear transformation of i.i.d. Gaussian random variables in three steps: generation of
independent Gaussian random numbers, multiplication of these random numbers by weights that are
recursively determined by the covariance structure, and construction of their running sum [25]. The
fGn are the innovations (increments) of fBm. In the sequel, we analyze the statistical significance
of the Lomb periodogram of fGn.
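The idea of turning independent Gaussian numbers into correlated ones through weights fixed by the covariance structure can be sketched with a plain Cholesky factorization. This is not the FracLab script and it does not use the Durbin-Levinson recursion: it factorizes the full covariance matrix directly, which is slower but short. The function names are hypothetical; the autocovariance used is the standard one for unit-variance fGn, $\frac{1}{2}(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H})$.

```python
import math
import random

def fgn_autocovariance(lag, hurst):
    """Autocovariance of unit-variance fractional Gaussian noise at an integer lag."""
    k = abs(lag)
    return 0.5 * ((k + 1) ** (2 * hurst) - 2 * k ** (2 * hurst)
                  + abs(k - 1) ** (2 * hurst))

def cholesky_lower(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def fractional_gaussian_noise(n, hurst, rng):
    """Draw n fGn values as L z, where the covariance matrix equals L L^T."""
    cov = [[fgn_autocovariance(i - j, hurst) for j in range(n)] for i in range(n)]
    low = cholesky_lower(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(low[i][j] * z[j] for j in range(i + 1)) for i in range(n)]

# Usage sketch: a persistent series, and the uncorrelated case H = 0.5.
persistent = fractional_gaussian_noise(50, 0.9, random.Random(3))
white = fractional_gaussian_noise(10, 0.5, random.Random(3))
```

For $H = 0.5$ the covariance matrix is the identity, so the output is just the underlying independent Gaussian numbers. The running sum of the returned values would give the corresponding fBm path.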



3.2 Numerical results 

In our simulations, the Hurst index ranges from 0.1 to 0.9 with spacing 0.1. For each H, we 
generated 50,000 fGn time series. The length of each time series is again 100 data points. Figure 8
shows four fractional Gaussian noises: (a) H = 0.2, (b) H = 0.5, (c) H = 0.7, and (d) H = 0.9. It 
is visually apparent that, the larger is H, the clearer is the presence of oscillations within the noise. 

We performed the Lomb analysis on each fGn time series and extracted the largest peaks in the 
Lomb periodograms. The dependence of the false-alarm or rejection probability as a function of 
the Lomb peak height for different types of noises is shown in Fig. 9. For H = 0.5, we recover the
exponential decay represented by the open circles. As H increases, the false-alarm probability 
increases very significantly and also exhibits important deviations from an exponential fall-off. 

For H < 0.5, the fGn is anti-persistent: a point with a negative (resp. positive) value in
the series is usually followed by a point with a positive (resp. negative) value. This effect favors a random
oscillatory pattern at the highest possible frequency, corresponding to a period of two points, but the
number of independent frequencies is then small. At lower frequencies, this anti-persistence makes
the spontaneous formation of other oscillatory regularities improbable. In contrast, the fGn with
H > 0.5 is persistent: a point with a negative (resp. positive) value in the series is usually followed
by a point with a negative (resp. positive) value. Hence, there are more regularities and structure than
for Gaussian white noise. Long-range correlations create strong artifactual periodicity. This
recovers, while generalizing to different noise spectra, the results of [^, 10]. Since a larger H implies a
larger correlation between successive data points, and hence stronger regularities, it is natural to
obtain a higher Lomb peak. This is true for the power spectra of the fGns, proportional to 1/f^β with
β = 2H − 1 < 2, which corresponds to the long-range correlated stationary noises analyzed here. This
is also true for noise spectra with exponent β > 2, for which the processes are not stationary (as in
a random walk, whose standard deviation grows as √t) with correlation functions blowing up.
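
The sign of the one-step correlation can be made explicit with the textbook autocovariance of unit-variance fGn, γ(k) = ½(|k+1|^(2H) − 2|k|^(2H) + |k−1|^(2H)), which gives ρ(1) = 2^(2H−1) − 1. This expression is not quoted in the text above; it is used here only as an illustrative sketch of why H < 0.5 favors sign reversals and H > 0.5 favors trends.

```python
def fgn_autocorrelation(lag, hurst):
    """Autocorrelation of unit-variance fractional Gaussian noise at an integer lag."""
    k = abs(lag)
    return 0.5 * ((k + 1) ** (2 * hurst)
                  - 2 * k ** (2 * hurst)
                  + abs(k - 1) ** (2 * hurst))

# The four Hurst exponents of the example panels: negative, zero, then positive correlation
for hurst in (0.2, 0.5, 0.7, 0.9):
    print(hurst, fgn_autocorrelation(1, hurst))
```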



Footnote: We used the MATLAB script named fbmlevinson.m included in FracLab, available at
http://www-rocq.inria.fr/fractales/Software/FRACLAB/.






4 Interplay between heavy tail and correlations 



4.1 The GARCH(1,1) process 

4.1.1 Method for generation of GARCH(1,1) residuals 

The Autoregressive Conditional Heteroscedasticity (ARCH) process was introduced by Engle in 1982
[36] to account for the so-called heteroscedasticity in economic time series. Heteroscedasticity
stands for the lack of stationary "volatility" (the term used in finance for the standard deviation),
i.e., the presence of periods of time with large volatilities alternating with periods of small
volatility. In an ARCH(q) process, the volatility at time t is a function of the observed data at t − 1,
t − 2, ..., t − q. In 1986, Bollerslev [33] introduced the Generalized ARCH or GARCH(p, q)
process, where the volatility at time t depends on the observed data at t − 1, t − 2, ..., t − q, as
well as on the volatilities at t − 1, t − 2, ..., t − p. Here, we focus our attention on one of the standard
benchmark models of financial time series, the GARCH(1,1) process, which obeys the following
evolution equations:

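$$x_t = \mu + \epsilon_t , \qquad (16)$$

$$\epsilon_t = \sqrt{h_t}\, z_t , \qquad (17)$$

$$h_t = \alpha + \beta \epsilon_{t-1}^2 + \gamma h_{t-1} , \qquad (18)$$

where z_t is a standardized, independent, identically distributed (i.i.d.) random variable drawn from
some specified probability distribution:

$$\langle z_t \rangle = 0 \quad \mathrm{and} \quad \langle z_t^2 \rangle = 1 . \qquad (19)$$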

GARCH models have been shown to capture not only volatility clustering but also some of the
leptokurtosis (i.e., heavy tails) commonly found in stock market and currency
exchange rate time series. However, GARCH models with conditionally normal errors generally fail to
capture sufficiently the leptokurtosis evident in asset returns. The increasing attention paid to
distributional properties (particularly tail heaviness) when estimating exchange rate models has
led to the widespread adoption of non-Gaussian conditional error distributions, most commonly the
Student-t [38, 3^ 4^]. The Student-t distribution has more extended tails than the normal
distribution and is asymptotically a power law with an exponent κ equal to its number of degrees of
freedom. The GARCH model embodies (exponentially decaying) correlations between successive
amplitudes (volatility) of the noise but assumes zero correlation between the signs of successive
noise innovations (increments).

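Since the probability density function of a Student-t distributed random variable with κ degrees
of freedom is

$$t(x, \kappa) = \frac{\Gamma[(\kappa+1)/2]}{\sqrt{\kappa\pi}\,\Gamma(\kappa/2)} \left(1 + \frac{x^2}{\kappa}\right)^{-(\kappa+1)/2} , \qquad (20)$$

we have (for n < κ)

$$\langle x^n \rangle = \begin{cases} \kappa^{n/2}\, \dfrac{\Gamma[(n+1)/2]\,\Gamma[(\kappa-n)/2]}{\Gamma(1/2)\,\Gamma(\kappa/2)} , & \mathrm{mod}(n, 2) = 0 , \\ 0 , & \mathrm{mod}(n, 2) = 1 . \end{cases} \qquad (21)$$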

In particular, for κ > 2, we have ⟨x²⟩ = κ/(κ − 2). Hence, to meet Eq. (19), we define the rescaled variable

$$z = \sqrt{(\kappa-2)/\kappa}\; x , \qquad (22)$$

whose probability density function is

$$f(z) = \frac{\Gamma[(\kappa+1)/2]}{\sqrt{\pi(\kappa-2)}\,\Gamma[\kappa/2]} \left(1 + \frac{z^2}{\kappa-2}\right)^{-(\kappa+1)/2} . \qquad (23)$$

z tends towards the standardized Gaussian random variable as κ → ∞.

We have generated GARCH(1,1) noise time series as follows: 1) For a given choice of the
parameters and of κ, generate t Student-t distributed random numbers; 2) Obtain z_1, z_2, ..., z_t from
(22); 3) Generate iteratively ε_1, ε_2, ..., ε_t, which have zero mean and conditional variance h_t, using
equations (17) and (18); 4) Discard a fixed number of the first data points in order to remove
any sensitivity to the initial condition.
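
The four steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names, the burn-in length of 500 points, the choice ε_0 = 0 for the first update, and the parameter values in the usage line are all assumptions made for the example (the values actually used are quoted in Sec. 4.1.2). The rescaling to unit variance requires κ > 2.

```python
import math
import random

def student_t(kappa, rng):
    """Draw one Student-t variate with kappa degrees of freedom."""
    chi2 = rng.gammavariate(kappa / 2.0, 2.0)  # chi-square with kappa d.o.f.
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / kappa)

def garch11_noise(n, kappa, alpha, beta, gamma, h0, burn_in=500, seed=0):
    """Return n values of eps_t from Eqs. (17)-(18) with unit-variance Student-t z_t."""
    rng = random.Random(seed)
    scale = math.sqrt((kappa - 2.0) / kappa)  # rescaling to <z^2> = 1, needs kappa > 2
    h, eps_prev = h0, 0.0                     # eps_0 = 0 is an assumed starting value
    out = []
    for _ in range(n + burn_in):
        h = alpha + beta * eps_prev ** 2 + gamma * h             # Eq. (18)
        eps_prev = math.sqrt(h) * scale * student_t(kappa, rng)  # Eq. (17)
        out.append(eps_prev)
    return out[burn_in:]  # step 4: discard the initial transient

# Illustrative parameter values only (not the values quoted in the paper).
eps = garch11_noise(n=100, kappa=6, alpha=1e-6, beta=0.05, gamma=0.9, h0=1e-5)
print(len(eps), min(eps), max(eps))
```

With these illustrative values the unconditional variance is α/(1 − β − γ) = 2 × 10⁻⁵, which gives a quick sanity check on a long simulated series.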

4.1.2 Numerical simulations 

We used the parametric values quoted in [||]: μ = 4.38 x 10"^, α = 2.19 x 10"^, β = 0.044,
γ = 0.922, and h_0 = 6.45 x 10"^. In Fig. 10, we show four typical simulated GARCH(1,1) noises:
(a) κ = 3, (b) κ = 6, (c) κ = 9, and (d) κ = 12. Local clustering of volatility is clearly
visible in the figure.

In order to obtain the significance level of the artificial GARCH(1,1) noises, we simulated a
large number of data sets. The artificial times were unevenly sampled, so that each time followed
the previous one by a random number between 0 and 1, as in previous simulations. We investigated
four types of noises with κ = 3, 6, 9, 12, typical examples of which are shown in Fig. 10. For each κ, we generated
50,000 data sets, each with 100 data points. We then Lomb transformed the 50,000 time series
and extracted the maximal Lomb peak height in each periodogram to form a set of peak heights. The results
are shown in Fig. 11. To compare with independent Gaussian noise, we also simulated 50,000
Gaussian time series and plotted the result in the same figure as open circles.
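
The Monte Carlo procedure described here (uneven sampling, Lomb transform, largest peak, empirical exceedance frequency) can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' code: it uses the usual variance-normalized Lomb periodogram, a frequency grid with oversampling factor 4 up to the average Nyquist frequency (the grid is not specified in this section), white Gaussian noise as input, and only 100 series of 50 points to keep the run time short. All function names are hypothetical.

```python
import math
import random

def lomb_periodogram(times, values, freqs):
    """Variance-normalized Lomb periodogram at the given (positive) frequencies."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    var = sum(d * d for d in dev) / (n - 1)
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # The offset tau makes the sine and cosine sums orthogonal
        tau = math.atan2(sum(math.sin(2.0 * w * t) for t in times),
                         sum(math.cos(2.0 * w * t) for t in times)) / (2.0 * w)
        cs = [math.cos(w * (t - tau)) for t in times]
        sn = [math.sin(w * (t - tau)) for t in times]
        yc = sum(d * c for d, c in zip(dev, cs))
        ys = sum(d * s for d, s in zip(dev, sn))
        power.append((yc * yc / sum(c * c for c in cs)
                      + ys * ys / sum(s * s for s in sn)) / (2.0 * var))
    return power

def uneven_times(n_points, rng):
    """Sampling times whose successive gaps are uniform in [0, 1)."""
    t, times = 0.0, []
    for _ in range(n_points):
        t += rng.random()
        times.append(t)
    return times

def max_peaks(n_series, n_points, rng):
    """Largest Lomb peak of each of n_series white Gaussian noise series."""
    peaks = []
    for _ in range(n_series):
        times = uneven_times(n_points, rng)
        values = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        span = times[-1] - times[0]
        # Assumed grid: oversampling factor 4, up to the average Nyquist frequency
        freqs = [(k + 1) / (4.0 * span) for k in range(2 * n_points)]
        peaks.append(max(lomb_periodogram(times, values, freqs)))
    return peaks

def false_alarm(peaks, h):
    """Empirical probability that the largest noise peak exceeds h."""
    return sum(1 for p in peaks if p > h) / len(peaks)

rng = random.Random(0)
peaks = max_peaks(n_series=100, n_points=50, rng=rng)  # far fewer than 50,000, for speed
for h in (2.0, 4.0, 6.0, 8.0):
    print(h, false_alarm(peaks, h))
```

Replacing the Gaussian draw in `max_peaks` by any other noise generator (for example GARCH residuals) gives the corresponding false-alarm curve; the number of series controls how far into the tail the estimate can be trusted.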

For a given peak height, the false-alarm probability increases with κ. This is the effect already
documented in Sec. ^. The effect is weak because κ ≥ 3, and the strongest dependence of the
false-alarm probability on the power-law tail exponent κ was shown to occur for smaller exponents.
Note that the false-alarm probability of the GARCH(1,1) process tends to the Gaussian curve as κ
increases. For the largest κ investigated, the false-alarm probability is slightly above the Gaussian
uncorrelated value, but the difference is small. The correlation of the variance of the
GARCH(1,1) process has almost no impact on the false-alarm probability, which is essentially
controlled by the non-Gaussian character of the distribution function. We checked this by reshuffling
the GARCH noise to eliminate the correlations in volatility. Performing the same analysis, we
find the same relation between Lomb peak height and false-alarm probability within statistical
fluctuations. Since reshuffling the GARCH(1,1) noise destroys the correlations of the variance
while keeping the same heavy-tailed distribution, we conclude that GARCH(1,1) correlations
have essentially no impact on the false-alarm probability of the detection of periodic components.
The slight difference between the false-alarm probabilities reported in Fig. 11 and in Fig. 5 for
the uncorrelated power-law noise with the same tail exponent can thus be attributed only to the
difference in the bulk of their distributions.
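
The logic of the reshuffling test can be illustrated on a toy example. In the sketch below, a series with alternating calm and turbulent blocks stands in for GARCH output (this stand-in, the block length and the two standard deviations are assumptions made for brevity). Shuffling leaves the set of values, and hence the marginal distribution, unchanged, while the lag-1 correlation of the absolute values, a simple proxy for volatility clustering, drops to near zero.

```python
import random

def lag1_autocorrelation(series):
    """Sample autocorrelation of a series at lag 1."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return cov / var

rng = random.Random(1)
# Toy series with volatility clustering (a stand-in for GARCH output):
# blocks of 100 points alternate between standard deviation 1 and 5.
noise = [rng.gauss(0.0, 1.0 if (i // 100) % 2 == 0 else 5.0) for i in range(2000)]

shuffled = noise[:]
rng.shuffle(shuffled)  # same values, random order

clustering_before = lag1_autocorrelation([abs(x) for x in noise])
clustering_after = lag1_autocorrelation([abs(x) for x in shuffled])
print(clustering_before, clustering_after)
```

Running the Lomb analysis on both `noise` and `shuffled` and comparing the resulting false-alarm curves is then the check described in the text.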

4.2 Fractional Levy noise 

4.2.1 Method for fractional Levy noise synthesis 

We now combine heavy tails and long-range correlations. For this, we study the fractional Levy
motion (fLm) [^, 43, 42], which is analogous to fBm [pO|], with the Gaussian distribution of
increments replaced by a Levy stable distribution. This corresponds to a moving average with infinite
memory over a set of independent random variables having a Levy distribution. Let us denote
B_H^α(t) for fLm. Then, we have B_{1/2}^α(t) = B^α(t) and B_H^2(t) = B_H(t). Fractional Levy motion is
a self-similar process with self-similar index 2H/α:

$$\{B_H^\alpha(at)\} \stackrel{d}{=} \{a^{2H/\alpha} B_H^\alpha(t)\} \qquad (24)$$

for a > 0. The symbol $\stackrel{d}{=}$ denotes that the two processes have the same distribution. The heavy-tail
behavior of fLm is [42]

(25)

There are several methods to generate a fractional noise, and the simplest one is to use
a fractional ARIMA [44]. It is easy to implement the algorithm by replacing the i.i.d. Gaussian
noise in the Durbin-Levinson method in Sec. ^ with Levy noise [44]. In this section, we applied the
wavelet method for synthesizing fractional Levy noise. Accurate synthesis of fBm using wavelets is
based on the wavelet analysis of the second-order statistics of fractional Brownian processes [31].
Hence, fBm is generated as a sum of wavelets [32, Similarly, we can use a wavelet transform
of Levy stable noise and then reconstruct a fractional Levy noise as in the generation of fBm with
the wavelet transform [31, 32, 0]. We use the Daubechies wavelet in the simulations. Thus,
fractional Levy noise is the series of increments of the fLm. There are other generating methods,
for instance using Fourier transforms, but the errors are more difficult to control.



4.2.2 Numerical simulations 

We simulated fractional Levy noises for different Hurst exponents H and heavy-tailness α. For
each simulation, we fixed the pair (H, α) among 19 × 40 possible values: H ranged from 0.05
to 0.95 with spacing 0.05, while α varied from 0.05 to 2 with step 0.05. Four typical noises are
illustrated in Fig. 12: (a) H = 0.2 and α = 0.4, (b) H = 0.4 and α = 0.8, (c) H = 0.6 and
α = 1.2, (d) H = 0.8 and α = 1.6.

To quantify the effect of the interplay between correlations and heavy tails in fractional Levy
noises, we introduce a characteristic quantity h_p, defined as the Lomb peak height corresponding
to a given false-alarm probability p. We checked that different choices of p resulted in the same
properties of h_p reported below, as long as p is not too small. We generated 1,000 time series, each
with 100 data points, for each of the 19 × 40 possible pairs (H, α).
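
In practice, h_p is an empirical quantile of the simulated maximal peak heights. The sketch below shows one way to read it off; the convention used (the smallest observed height exceeded by at most a fraction p of the samples) and the function name are assumptions, since the text does not specify the estimator.

```python
import math

def peak_height_at(max_peaks, p):
    """Smallest observed height exceeded by at most a fraction p of the samples."""
    ordered = sorted(max_peaks, reverse=True)
    allowed = int(math.floor(p * len(ordered) + 1e-9))  # samples allowed above h_p
    allowed = min(allowed, len(ordered) - 1)
    return ordered[allowed]

heights = list(range(1, 101))        # toy stand-in for simulated maximal Lomb peaks
print(peak_height_at(heights, 0.1))  # 90: exactly 10 of the 100 values exceed it
```

With 1,000 series per pair and p = 0.01, only 10 samples lie above h_p, which is consistent with the remark that p should not be chosen too small.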



Fig. 13 shows the dependence of h_{0.01} on H and α in the fractional Levy noise.
h_{0.01} increases with both H and α. This is consistent with the previous results obtained for each
of the ingredients: a larger correlation (larger H) increases the Lomb peaks (and thus the false-alarm
probability); a larger power-law exponent was also found to increase the false-alarm probability,
since it corresponds to weaker fluctuations and thus allows for the random appearance of coherent
periodic structures.



5 Joint-distribution of the frequency and height of the highest peak of 
the Lomb power spectrum 

Up to now, we have reported the false-alarm probability as a function of the highest Lomb peak,
independently of its frequency. In practice, the value of the false-alarm probability as a function
of frequency is also an important determinant of the statistical significance of a supposed periodic
signal. The confidence level of a Lomb peak in a given time series may be increased if one can
distinguish the frequency corresponding to the highest Lomb peak h"^ of the real signal from the
most probable frequency f^mp of a noise determined from synthetic time series. This approach has
been used for specific cases in [Q, 10].



To address this question, we now investigate the impact of the heavy-tailness of noise
distributions and of noise correlations on the two-point statistics of (f, h). We study Levy stable noise and
fractional Gaussian noise separately, as this is enough to show the main facts and the numerical
calculations become prohibitive for the combined fractional Levy noises. For each type of noise,
we synthesize 50,000 series, each with 100 data points. The same algorithms as in Sec. ^Jj are used
for the Levy stable noise and as in Sec. IsTTl for the fractional Gaussian noise.



Fig. 14 shows the probability density distribution p(f, h) of the highest Lomb peak height h
and its associated frequency f for the Levy stable noises, for α = 0.1, 1 and 1.5. In this case of
uncorrelated Levy noise, all frequencies are equiprobable: this is clearly retrieved in the distribution
for α = 1.5. Large fluctuations for the smaller α's show that the statistical ensemble of these
simulations is not sufficiently large to reach the asymptotic independence over the frequencies, as
large bursts of probability p(f, h) occur at certain frequencies. The dependence of p(f, h) on
h recovers our previous results shown in Fig. 3.

The probability densities p(f, h) of the highest Lomb peak height h and its associated frequency
f for fractional Gaussian noises are shown in Fig. 15, for H = 0.1, 0.5 and 0.9.



• For H = 0.1 and more generally for 0 < H < 0.5 (Fig. 15a), the fGn is anti-persistent,
i.e., the noise tends to reverse its sign and thus to oscillate fast. We should thus expect, and do
observe, periodic components with high frequencies. For instance, in Fig. 8a generated for
H = 0.2, one can observe about 30 "cycles" in the noisy time series, resulting in f^mp ≈ 0.6.

• For H = 0.5 (Fig. 15b), the fGn is white noise and all frequencies should be equivalent, as
observed.

• For H = 0.9 and more generally for 0.5 < H < 1 (Fig. 15c), the fGn is persistent, i.e.,
the noise tends to continue along a trend. We should thus expect, and do observe, periodic
components with very low frequencies. For instance, in Fig. 8d generated for H = 0.9, one
can observe about 2 "cycles" in the noisy time series, resulting in f^mp ≈ 0.02.

Generally, the number of "cycles" decreases as H increases. Thus, the most probable frequency
f^mp decreases with increasing H, ranging from to

We have verified in our simulations that the distribution p(f, h) for the α = 2 Levy stable noise
is identical to the distribution p(f, h) obtained for the fractional Gaussian noise with H = 0.5, as it
should be. This provides a confirmation of the validity of the applied algorithms.



6 Relation with previous works on the detection of log-periodicity 

We now briefly discuss how the simulations presented here shed light on previous announcements 
of the detection of log-periodicity. To apply the numerical results in this work, there are several 
issues that should be addressed in advance. 



6.1 Impact of amplitudes of harmonics 

It is natural that the underlying log-periodic function in the noisy signal is not the pure cosine shown
in Eq. (|). Then we will see harmonics in the Lomb periodogram. There are several different cases.
If the amplitude of the fundamental periodic component dominates, i.e. y_1 ≫ y_m for all m > 1,
then P_N(ω_1) ≫ P_N(ω_m). Note that the ratios ω_m/ω_1 are positive integers. In this situation, we can apply our
numerical results directly to determine the significance level of the Lomb peak.

In contrast, if the amplitude(s) of the harmonics is comparable with that of the fundamental
periodic component, i.e. y_1 ≈ y_m for some m > 1, at least two comparable peaks coexist in
the periodogram. In this case, it is more difficult to determine the significance of the peaks, as is
implied by Eq. (^.

It is also possible that for some systems there exists at least one harmonic component m such
that y_1 ≪ y_m, leading to P_N(ω_1) ≪ P_N(ω_m). Examples, which we have studied recently, will
be reported elsewhere. They include (1) the log-periodic residues of moments of some self-similar
multinomial measures [45]; (2) the q-derivative [46] of some "Weierstrass-type" functions [47]; and
(3) the canonically averaged local log-derivative of the moments of the energy dissipation rate in
three-dimensional fully developed turbulence [48].



There are other possible origins which might cause several high peaks in one Lomb peri- 
odogram, with low signal-to-noise ratio. If these peaks are disordered, it is hard to conclude that 
periodic component(s) exist. Meanwhile, high noise will also suppress the fundamental peak and 
increase harmonic peaks due to the interaction between signal and noise. 

In situations where y_1 ≫ y_m fails, none of the peaks is significant if we apply the numerical
results directly. However, it is well known that evenly spaced high peaks in the power periodogram
are a strong signature of the existence of a fundamental frequency. We can use a least-squares fit to
extract the fundamental frequency, as proposed in Refs. [45, 48]. This is however beyond the theme
of the present paper.

6.2 Statistics of highest peaks ratio 

In practice, it is always difficult to judge whether a given Lomb peak is significant or not, especially
when the nature of the noise is unknown. Then, the absolute amplitude of the highest Lomb peak
is difficult to interpret and to translate into a false-alarm probability and a confidence level. It is
then natural to introduce a measure of relative significance, for instance the ratio of the
highest Lomb peak to the second highest peak. Fig. 16 and Fig. 17 show the complementary
cumulative distributions of the ratio of the two highest Lomb peaks for Levy stable noise and
fractional Gaussian noise, respectively. Again, the number of samples is 50,000, each with 100 data points. The open
circles in the two figures correspond to Gaussian noise, and the slight discrepancy between them is caused by the
application of different algorithms. The complementary cumulative (false-alarm) probability of a
given ratio increases with α for Levy stable noise, which corroborates our previous statistics on
the absolute height of Lomb peaks: for small exponent α, it is very improbable to observe a large
ratio of the two highest Lomb peaks. Thus, a signal showing a large ratio would qualify as highly
significant.

The behavior of fGn is non-monotonic. The false-alarm (complementary cumulative)
probability decreases with H in the anti-persistence regime 0 < H < 0.5 and increases with H in
the persistence regime 0.5 < H < 1. Thus, both mechanisms, which lead to deviations from
uncorrelated randomness, lead to an increase of the false-alarm probability.
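
As an illustration of the ratio statistic, the sketch below extracts the two highest local maxima of a sampled periodogram and returns their ratio. How the peaks were identified is not specified in the text, so the definition used here (interior samples higher than their left neighbor and at least as high as their right neighbor) is an assumption, as is the function name.

```python
def highest_peak_ratio(power):
    """Ratio of the highest to the second-highest local maximum of a periodogram."""
    peaks = [power[i] for i in range(1, len(power) - 1)
             if power[i - 1] < power[i] >= power[i + 1]]
    if len(peaks) < 2:
        raise ValueError("need at least two local maxima")
    peaks.sort(reverse=True)
    return peaks[0] / peaks[1]

# Toy periodogram with local maxima 1.0, 3.0 and 6.0
print(highest_peak_ratio([0.2, 1.0, 0.3, 3.0, 0.5, 6.0, 0.1]))  # 2.0
```

Applying this function to each simulated periodogram and tabulating the fraction of ratios above a threshold gives the complementary cumulative distributions discussed above.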

6.3 Application to real systems 

We now review how the present analysis sheds light on our previous works that have used the Lomb
analysis to attempt to qualify the presence of log-periodicity in different complex systems.

Ref. [49] presents Lomb periodograms of the residual of the fit of the distribution of crack
lengths by a pure power law. Since the fit is performed with a power law, the natural variable is
the logarithm ln ℓ of the crack lengths: a detection of a periodic component in the residual of the
distribution of the crack lengths as a function of ln ℓ would qualify log-periodicity. The Lomb
periodogram shown in Fig. 14 of [49] for a model of crack growth has a very high peak of 90
with very narrow width and is highly significant, whatever the nature of the noise: Gaussian, Levy,
power law or even with long-range correlations. The Lomb periodogram shown in Fig. 16 of [49]
for geological data on the distribution of joint lengths has two peaks at a level between 45 and 50,
which are so close as to be barely distinguishable. All the other peaks are much smaller, none above
13. These two peaks are very significant, even in the most unfavorable case of
highly persistent noise.

Ref. [50] analyzed experimental data on the time evolution of the number, size and separation 
of vortices in freely decaying 2-d turbulence to investigate the existence of a discrete time scale 
invariance, which could reflect that the time-evolution of the merging of vortices is not smooth but 
punctuated, leading to a preferred scale factor and as a consequence to log-periodic oscillations. 
Three Lomb periodograms were reported which were averaged over 7-10 realizations of an exper- 
iment on freely decaying turbulence: Figs. 5 and 6 of Ref. [50] show the Lomb periodogram of 
the logarithmic derivative of the number of vortices with respect to time; Figs. 9 and 10 show the 
Lomb periodogram of the logarithmic derivative of the separation between vortices with respect to 
time; Figs. 13 and 14 show the Lomb periodogram of the logarithmic derivative of the mean radius 
of vortices with respect to time. In all cases, the maximum height is no more than about 3.5. The 
present study suggests that only for heavy-tailed noise with small exponent α or κ will each Lomb
periodogram achieve a reasonable statistical significance.

In [50], it was argued that the case for log-periodicity was based not on the evidence obtained
for each isolated data set but on the collective evidence that the maximum peaks were found for
each data set at about the same log-frequency, around 4-5, corresponding to a preferred scaling
ratio of 1.3. This log-frequency of 4-5 is much higher than the frequency one would expect from
the size of the interval (see Fig. 4 of [50] and the value of the most probable frequency due to noise,
equal to 1.5/1.8 = 0.8). Thus, statistical significance is argued to be based on the coincidence of
the three highest peaks of the three Lomb periodograms, which occur at frequencies far from the most
probable frequencies found in our studies. Taking a number of points around 35 in these data, the
signal-to-noise ratio is around y²/2σ² = 0.2 according to Eq. (^. This might be the reason for
the low Lomb peaks in the periodograms. On the other hand, we cannot find support from the
statistics proposed in Sec. 6.2. A further note of caution is necessary: the curves in figures 4, 8,
and 12 of Ref. [50] are quite similar to fractional Gaussian noise with H ≈ 0.3. According to this
alternative interpretation, long-range correlations, rather than log-periodicity, could characterize
these turbulence data. This example shows how difficult and complicated it is to determine the
significance level of a given peak in practice.

Ref. [51] reports in its Fig. 1 five Lomb periodograms of five time-dependent ion
concentrations in water close to the epicenter of the Kobe earthquake of Jan. 17, 1995, as a function of
log-frequency. The two largest peaks are of height 13 and 11. According to the uncorrelated Gaussian
benchmark, these two peaks give a false-alarm probability of about or less than 10^^,
suggesting a very strong significance for a genuine log-periodicity. The possible existence of heavy-tailed
noise reinforces this case even further. However, one cannot exclude the possible existence of some
long-range correlation in the ion concentrations, for instance associated with the time-dependent
evolution of water permeability. Our present study shows that going from a white-noise spectrum
to a 1/f spectrum transforms the false-alarm probability from 10~^ to about 10-20%.

References ||7|, ^ |8[ 53, 54] report several Lomb periodograms performed to qualify the
existence of log-periodicity in the price dynamics of speculative bubbles preceding large financial
crashes. The Lomb power spectra are normalized such that the highest peak has a height equal
to 1. Therefore, we cannot establish the corresponding false-alarm probability using the different
models of noises investigated here. It was argued that the correct method, in the absence of information
on the underlying noise, is to establish the statistical significance on the basis of a large difference
between the highest peak and the background. Figure 17 shows that a white-noise null hypothesis
requires a ratio larger than about 3 to qualify a genuine log-periodic component at the 99%
confidence level. In the worst and unrealistic scenario where the noise exhibits long-range correlations
with persistence, the ratio must be larger. For instance, figure 17 indicates that the ratio must reach
5 if the Hurst exponent is 0.8 or smaller. However, financial time series are known to have very
short-ranged and weak correlations in their returns, and it is thus hard to believe that strong
persistence with large Hurst exponents can characterize the statistics of the returns. We stress here that
we are not referring to the persistence property of the absolute value of the returns (or volatility) but
to that of the returns themselves, which is an entirely different problem. Many of the ratios of the
largest peak to the second largest one reported in the Lomb periodograms of [0, ||, |^ are
of the order of or larger than 3, and some reach 5 or more. This signals a high statistical significance. In
some instances, the ratio is not large, but it turns out that the second highest peak is associated with a
log-frequency harmonic to the log-frequency of the first peak. In this case, the statistics of the ratio
do not apply, as the existence of harmonics supports the conclusion that the peaks are not generated by
noise. Hence, the statistical analysis of the ratio of the highest peak to the second one strengthens
confidence in the existence of log-periodicity in financial crashes.



7 Conclusions 

We have presented statistical tests on the false-alarm probability that a spectral peak in the Lomb
periodogram analysis of a noisy periodic signal may result from noise. In order to mimic the large
variety of noises that may be present in natural and social data, we have investigated several types
of noises, including noises with power-law distributions and with short- and long-range correlations.
The false-alarm probability of a periodic component is found to be strongly dependent on the nature
of the noise. This underlines the difficulty in concluding unambiguously on the existence of a
genuine log-periodicity in noisy signals when the noise properties are not known a priori and are
thus difficult to distinguish from the signal. In the light of the statistical tests performed here, we
have briefly reviewed the evidence presented in past works on the existence of log-periodicity in
turbulence, earthquake and financial data. Our present study weakens the credibility of log-periodicity
for 2D freely decaying turbulence and strengthens it for fracture, for the ion-signature precursors to the Kobe
earthquake and for financial markets.

We hope that the present work will help in assessing more reliably the existence of periodicity
in noisy complex time series and will provide useful guidelines to test new data sets.

Figure 1: Surrogate Levy stable noises each with 10,000 data points for: (a) α = 2; (b) α = 1.5; (c) α = 1;
and (d) α = 0.5. Here, A = 0, = and a = 1/√2. Note the difference in the vertical scales of the four
panels.

Figure 2: The four Lomb periodograms corresponding to the four cases shown in Fig. 1 but with time
series of only 100 points.

Figure 3: False-alarm probability of the surrogate Levy noises for α ranging from 0.1 to 2 with step 0.1
from left to right. For a given Lomb peak height, the false-alarm probability increases with α.

Figure 4: Surrogate symmetrical power-law noises each with 10,000 data points for: (a) κ = 5; (b)
κ = 1.5; and (d) κ = 0.5. Note the difference in the vertical scales of the four panels.

Figure 5: False-alarm probability of the surrogate power-law tailed noise for κ ranging from 0.1 to 2
with spacing 0.1, from 2.5 to 4 with spacing 0.5, and for 5 and 6, from left to right. For a given Lomb peak
height, the false-alarm probability increases with κ. The line marked with stars is for κ = 2 while the circles
correspond to the Gaussian case.

Figure 6: Dependence of the maximal Lomb peak P_N(ω), averaged over 50,000 realizations of 100 points,
as a function of (t^ given by expression (13).

Figure 7: Dependence of P_N(ω) as a function of the power law exponent α.

Figure 8: Surrogate fractional Gaussian noises whose Hurst indexes are: (a) H = 0.2, (b) H = 0.5,
(c) H = 0.7, and (d) H = 0.9. With the increase of H, the fGn exhibits stronger regularity.

Figure 9: The relationship between the Lomb peak height and the significance level of 50,000 synthetic noises
with 100 data points. The Hurst index increases from 0.1 to 0.9 with spacing 0.1 from left to right.

Figure 10: Surrogate GARCH(1,1) noises using Eqs. (17) with: (a) κ = 3, (b) κ = 6, (c) κ = 9,
and (d) κ = 12.

Figure 11: Significance levels of the surrogate GARCH(1,1) data for κ = 3, 6, 9, 12 and comparison with
the independent Gaussian noise.

Figure 12: Fractional Levy noises generated with the wavelet method using Daubechies wavelets.

Figure 13: Dependence of h_{0.01} on H and α in fractional Levy noise. h_{0.01} increases with H (respectively
α) for fixed α (respectively H). Two-dimensional interpolations were carried out to smooth the surface.
The residual oscillations stem from statistical fluctuations.

Figure 14: Probability density p(f, h) of the highest Lomb peak height h and its associated frequency f for
Levy stable noises: (a) α = 0.1, (b) α = 1, and (c) α = 1.5.

Figure 15: Three types of probability density p(f, h) of the highest peak for fractional Gaussian noise: (a)
H = 0.1, (b) H = 0.5, and (c) H = 0.9.

Figure 16: Complementary cumulative distribution of the ratio of the highest Lomb peaks for Levy stable
noise with α varying from 0.1 to 2 as indicated by the arrow. The open circles correspond to α = 2.

Figure 17: Complementary cumulative (false-alarm) distribution of the ratio of the highest Lomb peaks for
fractional Gaussian noise with H varying from 0.1 to 0.9. The open circles, solid lines and dashed lines
correspond respectively to H = 0.5, 0 < H < 0.5 and 0.5 < H < 1.